-------------------------------------------------------------------------------------------
      name:  <unnamed>
       log:  D:\Dropbox\FIRSTPAPERDRAFTSANDPROGRAMS\CleanFiles4SubmissionUpload\LOGS\logcompare3approaches.log
  log type:  text
 opened on:  28 Mar 2013, 16:38:36

.   use "$path\3underreport.dta",clear

. 
.   * when compared with sales reported to the tax office ==>
.   * underrp_d : firm-level underreporting by the direct approach
.   * underrp_n : firm-level underreporting by the indirect approach
.   * underrp_m : firm-level underreporting by the MIMIC approach
.   * underrp_ms: firm-level underreporting in the survey, by the MIMIC approach
.  
.   *** ============================
.   *** PRODUCE TABLE 1 IN THE PAPER
.   *** ============================
.   *** matrix mTable1
.   egen cnt=rowmiss(underrp_d w city_no fsize2003 sector0)

.   tabstat underrp_d [aw=w] if cnt==0,stats(n mean sd min max) by(city_no) nototal save 

Summary for variables: underrp_d
     by categories of: city_no (city)

    city_no |         N      mean        sd       min       max
------------+--------------------------------------------------
Ulaanbaatar |       178  37.62226  29.21901         0        95
    Darkhan |        44  34.60675  26.62167         0        85
    Edernet |        46  43.46673  25.96465         0        97
       Hovd |        27  33.20124   30.0495         0        97
---------------------------------------------------------------

.   matrix mcity=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'

.   tabstat underrp_d [aw=w] if cnt==0,stats(n mean sd min max) by(sector0) nototal save 

Summary for variables: underrp_d
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd       min       max
-------------+--------------------------------------------------
 manufacture |       145  37.71136   29.4579         0        97
construction |        73  39.07806  28.22217         0        97
     service |        55  34.09879  29.70412         0        90
     tourism |        22   38.2622  28.01119         0        85
----------------------------------------------------------------

.   matrix msect=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)' 

.   tabstat underrp_d [aw=w] if cnt==0,stats(n mean sd min max) by(fsize2003) save 

Summary for variables: underrp_d
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd       min       max
-----------------+--------------------------------------------------
 small < 10 emps |        91  37.43232  30.64235         0        97
medium 10 <= x < |       175  37.17401  28.06215         0        97
    large >= 100 |        29   41.4869  29.49799         0        90
-----------------+--------------------------------------------------
           Total |       295  37.63719  28.85596         0        97
--------------------------------------------------------------------

.   matrix msize=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(StatTotal)'

.   matrix mTable1=mcity\msect\msize

.   matrix drop mcity msect msize

.   matrix list mTable1

mTable1[12,5]
                   N       mean         sd        min        max
underrp_d        178  37.622262  29.219014          0         95
underrp_d         44  34.606751  26.621673          0         85
underrp_d         46  43.466733  25.964654          0         97
underrp_d         27  33.201238  30.049502          0         97
underrp_d        145  37.711358  29.457896          0         97
underrp_d         73  39.078063  28.222172          0         97
underrp_d         55  34.098787  29.704116          0         90
underrp_d         22  38.262195  28.011187          0         85
underrp_d         91  37.432321  30.642348          0         97
underrp_d        175  37.174014  28.062149          0         97
underrp_d         29  41.486899  29.497988          0         90
underrp_d        295  37.637191  28.855965          0         97
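
Every row of mTable1 carries the variable name, so the stacking order (city, then industry, then firm size) is easy to lose track of once labels are typed by hand for export. A minimal sketch of labelling the rows in the matrix itself, assuming the row order of the tabstat output above. The label text is illustrative, and "Erdenet" is assumed to be the intended spelling of the value label "Edernet". The `///` continuations assume the lines are run from a do-file.

```stata
* Sketch only: name the 12 rows of mTable1 in the order tabstat produced them
* (4 cities, 4 industries, 3 size classes, then the total).
matrix rownames mTable1 = Ulaanbaatar Darkhan Erdenet Hovd ///
    Manufacture Construction Service Tourism ///
    Small Medium Large Total

* Listing the matrix again lets the labels be checked against the tabstat tables.
matrix list mTable1
```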

.   
.   
.   
.   *logout, save("$path2\table1") word replace
.   xml_tab mTable1,save("$path2\table") sheet(table1) /// 
>   title(Table 1 Mean % of underreporting in sales by the direct approach) ///
>   rnames(Ulaanbaatar Darkhan Erdenet Hovd Manufacture Construction Service ///
>   Tourism Small(<10) Medium(10-99) Large(>=100) Total) ///
>   notes(Note: top/bottom 5% are trimmed off; ///
>   figures are weighted by sampling weights; * denotes significance at 10%; ///
>   Source: WB PICS Mongolia (2004)) font("Times New Roman" 12) updateopts replace


note: results saved to D:\Dropbox\FIRSTPAPERDRAFTSANDPROGRAMS\CleanFiles4SubmissionUpload\TABLES\table.xml

.   
.   
.   
.  
.   *** pairwise Welch t tests: each group against the first (reference) group ***
.   
.   foreach var of varlist city_no sector0 {
  2.      tabstat underrp_d if cnt==0 [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'
  4.          
.          forvalue i=1/4 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.          ttesti `n1' `m1' `sd1' `n4' `m4' `sd4',unequal 
 12.   }

Summary for variables: underrp_d
     by categories of: city_no (city)

    city_no |         N      mean        sd
------------+------------------------------
Ulaanbaatar |       178  37.62226  29.21901
    Darkhan |        44  34.60675  26.62167
    Edernet |        46  43.46673  25.96465
       Hovd |        27  33.20124   30.0495
------------+------------------------------
      Total |       295  37.63719  28.85596
-------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     178    37.62226    2.190058    29.21901    33.30028    41.94425
       y |      44    34.60675    4.013368    26.62167    26.51302    42.70048
---------+--------------------------------------------------------------------
combined |     222    37.02459     1.92555    28.69006     33.2298    40.81938
---------+--------------------------------------------------------------------
    diff |            3.015511    4.572032               -6.101097    12.13212
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   0.6596
Ho: diff = 0                     Satterthwaite's degrees of freedom =  70.8946

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.7442         Pr(|T| > |t|) = 0.5117          Pr(T > t) = 0.2558
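
The unequal-variance test above can be recomputed by hand from the three summary statistics per group, which makes clear what ttesti does with the weighted N, mean and sd passed to it. A minimal sketch for the Ulaanbaatar versus Darkhan comparison; the inputs are copied from the tabstat table, the scalar names are illustrative, and the results should agree with the ttesti output above only up to rounding of those inputs.

```stata
* Sketch only: Welch t statistic and Satterthwaite degrees of freedom by hand.
scalar sc_n1 = 178
scalar sc_m1 = 37.62226
scalar sc_s1 = 29.21901
scalar sc_n2 = 44
scalar sc_m2 = 34.60675
scalar sc_s2 = 26.62167

* squared standard error of each group mean
scalar sc_v1 = sc_s1^2/sc_n1
scalar sc_v2 = sc_s2^2/sc_n2

* standard error of the difference, t statistic, Satterthwaite df
scalar sc_se = sqrt(sc_v1 + sc_v2)
scalar sc_t  = (sc_m1 - sc_m2)/sc_se
scalar sc_df = (sc_v1 + sc_v2)^2/(sc_v1^2/(sc_n1 - 1) + sc_v2^2/(sc_n2 - 1))

display "se(diff) = " %9.6f sc_se "   t = " %7.4f sc_t "   df = " %8.4f sc_df
display "two-sided p = " %6.4f 2*ttail(sc_df, abs(sc_t))
```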

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     178    37.62226    2.190058    29.21901    33.30028    41.94425
       y |      46    43.46673    3.828279    25.96465    35.75618    51.17728
---------+--------------------------------------------------------------------
combined |     224    38.82247    1.912461     28.6231    35.05366    42.59127
---------+--------------------------------------------------------------------
    diff |           -5.844472    4.410451               -14.62649    2.937545
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -1.3251
Ho: diff = 0                     Satterthwaite's degrees of freedom =  77.1724

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0945         Pr(|T| > |t|) = 0.1890          Pr(T > t) = 0.9055

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     178    37.62226    2.190058    29.21901    33.30028    41.94425
       y |      27    33.20124    5.783029     30.0495    21.31405    45.08843
---------+--------------------------------------------------------------------
combined |     205    37.03998    2.045918    29.29309    33.00612    41.07384
---------+--------------------------------------------------------------------
    diff |            4.421024    6.183832               -8.147537    16.98958
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   0.7149
Ho: diff = 0                     Satterthwaite's degrees of freedom =  33.8901

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.7602         Pr(|T| > |t|) = 0.4795          Pr(T > t) = 0.2398

Summary for variables: underrp_d
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd
-------------+------------------------------
 manufacture |       145  37.71136   29.4579
construction |        73  39.07806  28.22217
     service |        55  34.09879  29.70412
     tourism |        22   38.2622  28.01119
-------------+------------------------------
       Total |       295  37.63719  28.85596
--------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     145    37.71136    2.446345     29.4579    32.87597    42.54674
       y |      73    39.07806    3.303155    28.22217    32.49334    45.66278
---------+--------------------------------------------------------------------
combined |     218    38.16902    1.963587    28.99202    34.29887    42.03916
---------+--------------------------------------------------------------------
    diff |           -1.366705    4.110406               -9.488448    6.755037
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.3325
Ho: diff = 0                     Satterthwaite's degrees of freedom =  150.071

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.3700         Pr(|T| > |t|) = 0.7400          Pr(T > t) = 0.6300

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     145    37.71136    2.446345     29.4579    32.87597    42.54674
       y |      55    34.09879    4.005302    29.70412    26.06864    42.12893
---------+--------------------------------------------------------------------
combined |     200     36.7179    2.085635    29.49534    32.60512    40.83068
---------+--------------------------------------------------------------------
    diff |            3.612571    4.693298               -5.702625    12.92777
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   0.7697
Ho: diff = 0                     Satterthwaite's degrees of freedom =  96.7548

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.7783         Pr(|T| > |t|) = 0.4433          Pr(T > t) = 0.2217

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     145    37.71136    2.446345     29.4579    32.87597    42.54674
       y |      22     38.2622    5.972005    28.01119    25.84273    50.68166
---------+--------------------------------------------------------------------
combined |     167    37.78392    2.258791    29.19002    33.32426    42.24359
---------+--------------------------------------------------------------------
    diff |           -.5508375    6.453638               -13.75963    12.65795
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.0854
Ho: diff = 0                     Satterthwaite's degrees of freedom =  28.5218

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4663         Pr(|T| > |t|) = 0.9326          Pr(T > t) = 0.5337

.         
. foreach var of varlist fsize2003 {
  2.      tabstat underrp_d if cnt==0 [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'
  4.          
.          forvalue i=1/3 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.           
.   }     

Summary for variables: underrp_d
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd
-----------------+------------------------------
 small < 10 emps |        91  37.43232  30.64235
medium 10 <= x < |       175  37.17401  28.06215
    large >= 100 |        29   41.4869  29.49799
-----------------+------------------------------
           Total |       295  37.63719  28.85596
------------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      91    37.43232    3.212191    30.64235    31.05074     43.8139
       y |     175    37.17401    2.121299    28.06215    32.98722     41.3608
---------+--------------------------------------------------------------------
combined |     266    37.26238    1.772779    28.91315    33.77186    40.75291
---------+--------------------------------------------------------------------
    diff |            .2583063    3.849426               -7.340849    7.857462
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   0.0671
Ho: diff = 0                     Satterthwaite's degrees of freedom =  168.993

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.5267         Pr(|T| > |t|) = 0.9466          Pr(T > t) = 0.4733

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      91    37.43232    3.212191    30.64235    31.05074     43.8139
       y |      29     41.4869    5.477639    29.49799    30.26646    52.70733
---------+--------------------------------------------------------------------
combined |     120    38.41218    2.765724    30.29699    32.93577    43.88859
---------+--------------------------------------------------------------------
    diff |           -4.054579    6.350016               -16.81691    8.707748
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.6385
Ho: diff = 0                     Satterthwaite's degrees of freedom =  48.7744

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2631         Pr(|T| > |t|) = 0.5261          Pr(T > t) = 0.7369

.   drop cnt

.    
.   *** ============================
.   *** PRODUCE TABLE 2 IN THE PAPER
.   *** ============================
.   *** matrix mTable2
.   egen cnt=rowmiss(underrp_n w city_no fsize2003 sector0)

.   count if cnt==0&underrp_n<0
   96

.   tabstat underrp_n [aw=w] if cnt==0,stats(n mean sd min max) by(city_no) nototal save 

Summary for variables: underrp_n
     by categories of: city_no (city)

    city_no |         N      mean        sd       min       max
------------+--------------------------------------------------
Ulaanbaatar |       139  14.21073  42.98594 -159.2876  91.26743
    Darkhan |        51  19.70475  55.63968 -121.1875  91.82417
    Edernet |        46   13.4734  54.22435 -154.3167  90.09039
       Hovd |        20  16.42181  49.83532 -132.7835      86.7
---------------------------------------------------------------

.   matrix mcity=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'

.   tabstat underrp_n [aw=w] if cnt==0,stats(n mean sd min max) by(sector0) nototal save 

Summary for variables: underrp_n
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd       min       max
-------------+--------------------------------------------------
 manufacture |       133  17.03885  48.70524 -159.2876  91.82417
construction |        66   12.4025  47.11541 -154.3167   91.2729
     service |        44  19.99456  36.98602     -96.8  90.83571
     tourism |        13 -2.060973  16.76303 -22.36989  64.86562
----------------------------------------------------------------

.   matrix msect=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)' 

.   tabstat underrp_n [aw=w] if cnt==0,stats(n mean sd min max) by(fsize2003) save 

Summary for variables: underrp_n
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd       min       max
-----------------+--------------------------------------------------
 small < 10 emps |        70  26.24064  42.32289 -121.1875  91.82417
medium 10 <= x < |       157  12.48073  48.29921 -159.2876   91.2729
    large >= 100 |        29  4.351224  28.34511   -80.933  71.46745
-----------------+--------------------------------------------------
           Total |       256  14.75775  45.40728 -159.2876  91.82417
--------------------------------------------------------------------

.   matrix msize=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(StatTotal)'

.   matrix mTable2=mcity\msect\msize

.   matrix drop mcity msect msize

.   matrix list mTable2

mTable2[12,5]
                    N        mean          sd         min         max
underrp_n         139   14.210726   42.985936  -159.28763   91.267426
underrp_n          51   19.704752    55.63968   -121.1875   91.824165
underrp_n          46   13.473399   54.224353  -154.31667   90.090385
underrp_n          20   16.421806   49.835316  -132.78354   86.699997
underrp_n         133   17.038848   48.705235  -159.28763   91.824165
underrp_n          66   12.402497   47.115408  -154.31667   91.272903
underrp_n          44   19.994563   36.986025  -96.800003   90.835709
underrp_n          13  -2.0609731   16.763025  -22.369892   64.865623
underrp_n          70   26.240643   42.322887   -121.1875   91.824165
underrp_n         157   12.480727   48.299211  -159.28763   91.272903
underrp_n          29   4.3512239    28.34511  -80.932999   71.467445
underrp_n         256   14.757746   45.407284  -159.28763   91.824165

.  
.   
.   *logout, save("$path2\table2") word replace
.  
.   xml_tab mTable2,save("$path2\table") sheet(table2) ///
>   title(Table 2 Average % of Sales Underreported by the Indirect Approach) ///
>   rnames(Ulaanbaatar Darkhan Erdenet Hovd Manufacture Construction Service ///
>   Tourism Small(<10) Medium(10-99) Large(>=100) Total) ///
>   notes(Note: top/bottom 5% are trimmed off; figures are weighted by sampling weights; *, ** ///
>   denote significance at 10% and 5% respectively; ///
>   Source: WB PICS Mongolia (2004) & Tax Office Data 2003) ///
>   font("Times New Roman" 12) append


note: results saved to D:\Dropbox\FIRSTPAPERDRAFTSANDPROGRAMS\CleanFiles4SubmissionUpload\TABLES\table.xml

.   
.   *** pairwise Welch t tests: each group against the first (reference) group ***
.   
.  foreach var of varlist city_no sector0 {
  2.      tabstat underrp_n if cnt==0 [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'
  4.          
.          forvalue i=1/4 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.          ttesti `n1' `m1' `sd1' `n4' `m4' `sd4',unequal 
 12.   }

Summary for variables: underrp_n
     by categories of: city_no (city)

    city_no |         N      mean        sd
------------+------------------------------
Ulaanbaatar |       139  14.21073  42.98594
    Darkhan |        51  19.70475  55.63968
    Edernet |        46   13.4734  54.22435
       Hovd |        20  16.42181  49.83532
------------+------------------------------
      Total |       256  14.75775  45.40728
-------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     139    14.21073     3.64602    42.98594    7.001439    21.42001
       y |      51    19.70475    7.791114    55.63968    4.055839    35.35366
---------+--------------------------------------------------------------------
combined |     190    15.68544    3.382717    46.62754    9.012708    22.35817
---------+--------------------------------------------------------------------
    diff |           -5.494025    8.602029               -22.63773    11.64968
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.6387
Ho: diff = 0                     Satterthwaite's degrees of freedom =  73.0287

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2625         Pr(|T| > |t|) = 0.5250          Pr(T > t) = 0.7375

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     139    14.21073     3.64602    42.98594    7.001439    21.42001
       y |      46     13.4734    7.994945    54.22435   -2.629247    29.57604
---------+--------------------------------------------------------------------
combined |     185    14.02739    3.373214    45.88067    7.372241    20.68254
---------+--------------------------------------------------------------------
    diff |            .7373278     8.78707               -16.81294     18.2876
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   0.0839
Ho: diff = 0                     Satterthwaite's degrees of freedom =  64.7507

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.5333         Pr(|T| > |t|) = 0.9334          Pr(T > t) = 0.4667

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     139    14.21073     3.64602    42.98594    7.001439    21.42001
       y |      20    16.42181    11.14352    49.83532    -6.90184    39.74545
---------+--------------------------------------------------------------------
combined |     159    14.48885    3.468719    43.73888     7.63781    21.33989
---------+--------------------------------------------------------------------
    diff |            -2.21108    11.72482               -26.45135    22.02919
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.1886
Ho: diff = 0                     Satterthwaite's degrees of freedom =   23.249

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4260         Pr(|T| > |t|) = 0.8521          Pr(T > t) = 0.5740

Summary for variables: underrp_n
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd
-------------+------------------------------
 manufacture |       133  17.03885  48.70524
construction |        66   12.4025  47.11541
     service |        44  19.99456  36.98602
     tourism |        13 -2.060973  16.76303
-------------+------------------------------
       Total |       256  14.75775  45.40728
--------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     133    17.03885    4.223279    48.70524    8.684784    25.39291
       y |      66     12.4025    5.799506    47.11541    .8200836    23.98491
---------+--------------------------------------------------------------------
combined |     199    15.50116    3.410741    48.11443    8.775122     22.2272
---------+--------------------------------------------------------------------
    diff |            4.636351    7.174284               -9.553421    18.82612
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   0.6462
Ho: diff = 0                     Satterthwaite's degrees of freedom =  133.703

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.7404         Pr(|T| > |t|) = 0.5192          Pr(T > t) = 0.2596

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     133    17.03885    4.223279    48.70524    8.684784    25.39291
       y |      44    19.99456    5.575853    36.98602    8.749784    31.23934
---------+--------------------------------------------------------------------
combined |     177     17.7736    3.456763    45.98924    10.95156    24.59564
---------+--------------------------------------------------------------------
    diff |           -2.955716    6.994729               -16.83981    10.92838
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.4226
Ho: diff = 0                     Satterthwaite's degrees of freedom =  96.1779

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.3368         Pr(|T| > |t|) = 0.6736          Pr(T > t) = 0.6632

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     133    17.03885    4.223279    48.70524    8.684784    25.39291
       y |      13   -2.060973    4.649227    16.76303   -12.19077    8.068822
---------+--------------------------------------------------------------------
combined |     146    15.33818    3.892889    47.03796    7.644041    23.03232
---------+--------------------------------------------------------------------
    diff |            19.09982    6.281035                6.380585    31.81906
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   3.0409
Ho: diff = 0                     Satterthwaite's degrees of freedom =  37.6444

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9979         Pr(|T| > |t|) = 0.0043          Pr(T > t) = 0.0021

.         
. foreach var of varlist fsize2003 {
  2.      tabstat underrp_n if cnt==0 [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'
  4.          
.          forvalue i=1/3 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.           
.   }     

Summary for variables: underrp_n
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd
-----------------+------------------------------
 small < 10 emps |        70  26.24064  42.32289
medium 10 <= x < |       157  12.48073  48.29921
    large >= 100 |        29  4.351224  28.34511
-----------------+------------------------------
           Total |       256  14.75775  45.40728
------------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      70    26.24064    5.058553    42.32289    16.14911    36.33218
       y |     157    12.48073    3.854697    48.29921    4.866593    20.09486
---------+--------------------------------------------------------------------
combined |     227    16.72387     3.11151    46.87963    10.59259    22.85515
---------+--------------------------------------------------------------------
    diff |            13.75992    6.359846                1.193478    26.32635
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   2.1636
Ho: diff = 0                     Satterthwaite's degrees of freedom =  150.023

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9840         Pr(|T| > |t|) = 0.0321          Pr(T > t) = 0.0160

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      70    26.24064    5.058553    42.32289    16.14911    36.33218
       y |      29    4.351224    5.263555    28.34511   -6.430679    15.13313
---------+--------------------------------------------------------------------
combined |      99    19.82859    4.008803    39.88709    11.87325    27.78393
---------+--------------------------------------------------------------------
    diff |            21.88942    7.300271                7.352617    36.42622
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   2.9984
Ho: diff = 0                     Satterthwaite's degrees of freedom =  76.9654

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9982         Pr(|T| > |t|) = 0.0037          Pr(T > t) = 0.0018
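
Note: the unequal-variance t tests printed above can be reproduced by hand from each group's N, mean, and sd. The sketch below is not part of the original do-file. It recomputes the small-versus-medium comparison using inputs copied from the table above; the scalar names prefixed chk_ are hypothetical.

```stata
* Hypothetical hand check (not in the original do-file):
* unequal-variance t test with Satterthwaite's degrees of freedom
scalar chk_n1  = 70
scalar chk_m1  = 26.24064
scalar chk_sd1 = 42.32289
scalar chk_n2  = 157
scalar chk_m2  = 12.48073
scalar chk_sd2 = 48.29921

* squared standard error of each group mean
scalar chk_v1 = chk_sd1^2/chk_n1
scalar chk_v2 = chk_sd2^2/chk_n2

* standard error of the difference, t statistic, Satterthwaite df
scalar chk_se = sqrt(chk_v1 + chk_v2)
scalar chk_t  = (chk_m1 - chk_m2)/chk_se
scalar chk_df = (chk_v1 + chk_v2)^2 / ///
    (chk_v1^2/(chk_n1 - 1) + chk_v2^2/(chk_n2 - 1))

* two-sided p-value from the t distribution
scalar chk_p = 2*ttail(chk_df, abs(chk_t))

display "se = " chk_se "  t = " chk_t "  df = " chk_df "  p = " chk_p
```

Up to rounding of the copied inputs, the displayed values should agree with the Std. Err., t, degrees of freedom, and two-sided p-value in the first test above.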

.   drop cnt

.  
.   replace underrp_d = underrp_d/100
(230 real changes made)

.   replace underrp_n = underrp_n/100
(247 real changes made)

.   g underrp_ms = 1 - sale_s/y_p
(133 missing values generated)

. 
.   egen rmiss=rowmiss(underrp_d underrp_n underrp_m)

.   sum underrp_* [aw=w] if rmiss==0

    Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------------------
   underrp_d |     187  445.508273    .3504845   .2809051          0        .97
   underrp_n |     187  445.508273    .1446992    .458843  -1.592876    .912729
   underrp_m |     187  445.508273    .3787979   .2064393   .0860808   .8913365
  underrp_ms |     187  445.508273    .1192932   .3731093  -1.587742   .6864506

.  
.   count if rmiss==0&underrp_d==0
   44

.   
.   *** produce TABLE 4 in the paper ***
.   *** Underreporting to the tax office and in the survey ///
>   *** across cities, sectors, firm size, corruption, and credit constraint
.   *** ===========================================================
.   *** with sampling weights
.  
.   * to tax office
.   tabstat underrp_m [aw=w], by(city_no) stats(n mean sd) nototal save

Summary for variables: underrp_m
     by categories of: city_no (city)

    city_no |         N      mean        sd
------------+------------------------------
Ulaanbaatar |       126  .3689215  .2141801
    Darkhan |        50  .4660149  .2466382
    Edernet |        38   .461547  .2237498
       Hovd |        17  .3801461  .2046161
-------------------------------------------

.   *return list
.   matrix mcity=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'

.   tabstat underrp_m [aw=w], by(sector0) stats(n mean sd) nototal save

Summary for variables: underrp_m
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd
-------------+------------------------------
 manufacture |       125  .4236386  .2404192
construction |        62  .3387826  .1836817
     service |        30  .3672198  .2165334
     tourism |        14  .4366113  .2422149
--------------------------------------------

.   matrix msect=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'

.   tabstat underrp_m [aw=w], by(fsize2003) stats(n mean sd) nototal save

Summary for variables: underrp_m
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd
-----------------+------------------------------
 small < 10 emps |        57  .4278063  .2447797
medium 10 <= x < |       152  .3738983  .2181151
    large >= 100 |        22  .3763976  .1636822
------------------------------------------------

.   matrix msize=r(Stat1)'\r(Stat2)'\r(Stat3)'

.   tabstat underrp_m [aw=w], by(bribe) stats(n mean sd) nototal save

Summary for variables: underrp_m
     by categories of: bribe (dummy,=1 if pay bribes)

   bribe |         N      mean        sd
---------+------------------------------
       0 |       121  .3526269  .2019427
       1 |       110  .4194766   .233395
----------------------------------------

.   matrix mbrib=r(Stat1)'\r(Stat2)'

.   tabstat underrp_m [aw=w], by(credit) stats(n mean sd) save

Summary for variables: underrp_m
     by categories of: credit (=1 if credit constrained)

  credit |         N      mean        sd
---------+------------------------------
       0 |       151  .3784394  .2128059
       1 |        80  .3991653  .2336256
---------+------------------------------
   Total |       231  .3859166  .2202516
----------------------------------------

.   matrix mcred=r(Stat1)'\r(Stat2)'\r(StatTotal)'

.  
.   matrix munderrp_tax=mcity\msect\msize\mbrib\mcred

.   matrix list munderrp_tax

munderrp_tax[16,3]
                   N       mean         sd
underrp_m        126  .36892146  .21418009
underrp_m         50  .46601492  .24663817
underrp_m         38  .46154703  .22374982
underrp_m         17  .38014608  .20461612
underrp_m        125   .4236386  .24041919
underrp_m         62  .33878261  .18368172
underrp_m         30  .36721976   .2165334
underrp_m         14  .43661133  .24221489
underrp_m         57  .42780633  .24477972
underrp_m        152  .37389833  .21811515
underrp_m         22  .37639765  .16368221
underrp_m        121  .35262692  .20194266
underrp_m        110  .41947655  .23339499
underrp_m        151  .37843935  .21280587
underrp_m         80  .39916531  .23362562
underrp_m        231  .38591665  .22025159

.   matrix drop mcity msect msize mbrib mcred
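
Note: every row of the stacked matrix listed above is named underrp_m, so the group behind each row has to be inferred from row order. A minimal sketch of labelling rows with matrix rownames follows. It is not part of the original do-file. The matrix name demo is hypothetical and holds only the first two city rows, copied from the listing above.

```stata
* Hypothetical illustration (not in the original do-file):
* build a small matrix from the first two city rows shown above
matrix demo = (126, .36892146, .21418009 \ 50, .46601492, .24663817)

* name the columns after the statistics and the rows after the groups
matrix colnames demo = N mean sd
matrix rownames demo = Ulaanbaatar Darkhan

matrix list demo
```

The same idea would extend to the full 16-row matrix by supplying one name per row, in the order in which the blocks were stacked.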

.   
.    *** pairwise unequal-variance t tests: each group vs. the first group ***
.  foreach var of varlist city_no sector0 {
  2.      tabstat underrp_m [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'
  4.          
.          forvalue i=1/4 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.          ttesti `n1' `m1' `sd1' `n4' `m4' `sd4',unequal 
 12.   }

Summary for variables: underrp_m
     by categories of: city_no (city)

    city_no |         N      mean        sd
------------+------------------------------
Ulaanbaatar |       126  .3689215  .2141801
    Darkhan |        50  .4660149  .2466382
    Edernet |        38   .461547  .2237498
       Hovd |        17  .3801461  .2046161
------------+------------------------------
      Total |       231  .3859166  .2202516
-------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     126    .3689215    .0190807    .2141801    .3311584    .4066845
       y |      50    .4660149    .0348799    .2466382    .3959211    .5361087
---------+--------------------------------------------------------------------
combined |     176    .3965048    .0171437    .2274364    .3626699    .4303398
---------+--------------------------------------------------------------------
    diff |           -.0970935    .0397578               -.1762153   -.0179716
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -2.4421
Ho: diff = 0                     Satterthwaite's degrees of freedom =  79.9096

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0084         Pr(|T| > |t|) = 0.0168          Pr(T > t) = 0.9916

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     126    .3689215    .0190807    .2141801    .3311584    .4066845
       y |      38     .461547     .036297    .2237498    .3880023    .5350918
---------+--------------------------------------------------------------------
combined |     164    .3903835    .0171222    .2192708    .3565736    .4241933
---------+--------------------------------------------------------------------
    diff |           -.0926256    .0410067               -.1746814   -.0105698
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -2.2588
Ho: diff = 0                     Satterthwaite's degrees of freedom =  58.9424

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0138         Pr(|T| > |t|) = 0.0276          Pr(T > t) = 0.9862

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     126    .3689215    .0190807    .2141801    .3311584    .4066845
       y |      17    .3801461    .0496267    .2046161    .2749422      .48535
---------+--------------------------------------------------------------------
combined |     143    .3702559    .0177614     .212396    .3351448    .4053669
---------+--------------------------------------------------------------------
    diff |           -.0112246    .0531684               -.1217876    .0993383
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.2111
Ho: diff = 0                     Satterthwaite's degrees of freedom =  21.0213

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4174         Pr(|T| > |t|) = 0.8348          Pr(T > t) = 0.5826

Summary for variables: underrp_m
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd
-------------+------------------------------
 manufacture |       125  .4236386  .2404192
construction |        62  .3387826  .1836817
     service |        30  .3672198  .2165334
     tourism |        14  .4366113  .2422149
-------------+------------------------------
       Total |       231  .3859166  .2202516
--------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     125    .4236386    .0215037    .2404192    .3810767    .4662005
       y |      62    .3387826    .0233276    .1836817    .2921362     .385429
---------+--------------------------------------------------------------------
combined |     187    .3955045    .0165474    .2262821    .3628598    .4281492
---------+--------------------------------------------------------------------
    diff |             .084856    .0317268                .0221802    .1475318
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   2.6746
Ho: diff = 0                     Satterthwaite's degrees of freedom =  154.009

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9959         Pr(|T| > |t|) = 0.0083          Pr(T > t) = 0.0041

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     125    .4236386    .0215037    .2404192    .3810767    .4662005
       y |      30    .3672198    .0395334    .2165334    .2863649    .4480747
---------+--------------------------------------------------------------------
combined |     155    .4127188    .0189857      .23637    .3752128    .4502248
---------+--------------------------------------------------------------------
    diff |            .0564188    .0450034               -.0340801    .1469178
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   1.2537
Ho: diff = 0                     Satterthwaite's degrees of freedom =   47.722

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.8920         Pr(|T| > |t|) = 0.2161          Pr(T > t) = 0.1080

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     125    .4236386    .0215037    .2404192    .3810767    .4662005
       y |      14    .4366113    .0647347    .2422149    .2967606     .576462
---------+--------------------------------------------------------------------
combined |     139    .4249452    .0203352    .2397489    .3847363    .4651541
---------+--------------------------------------------------------------------
    diff |           -.0129727    .0682128               -.1575724    .1316269
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.1902
Ho: diff = 0                     Satterthwaite's degrees of freedom =  16.0068

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4258         Pr(|T| > |t|) = 0.8516          Pr(T > t) = 0.5742

.         
. foreach var of varlist fsize2003 {
  2.      tabstat underrp_m [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'
  4.          
.          forvalue i=1/3 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.           
.   }     

Summary for variables: underrp_m
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd
-----------------+------------------------------
 small < 10 emps |        57  .4278063  .2447797
medium 10 <= x < |       152  .3738983  .2181151
    large >= 100 |        22  .3763976  .1636822
-----------------+------------------------------
           Total |       231  .3859166  .2202516
------------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      57    .4278063    .0324219    .2447797    .3628575    .4927551
       y |     152    .3738983    .0176915    .2181151    .3389435    .4088531
---------+--------------------------------------------------------------------
combined |     209    .3886005     .015659    .2263797    .3577298    .4194712
---------+--------------------------------------------------------------------
    diff |             .053908    .0369346               -.0194547    .1272707
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   1.4596
Ho: diff = 0                     Satterthwaite's degrees of freedom =  91.3105

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9261         Pr(|T| > |t|) = 0.1478          Pr(T > t) = 0.0739

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      57    .4278063    .0324219    .2447797    .3628575    .4927551
       y |      22    .3763976    .0348972    .1636822     .303825    .4489703
---------+--------------------------------------------------------------------
combined |      79      .41349    .0253503    .2253185    .3630214    .4639586
---------+--------------------------------------------------------------------
    diff |            .0514087    .0476339               -.0439773    .1467947
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   1.0792
Ho: diff = 0                     Satterthwaite's degrees of freedom =  56.9794

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.8575         Pr(|T| > |t|) = 0.2850          Pr(T > t) = 0.1425

.   
.   foreach var of varlist credit bribe{
  2.      tabstat underrp_m [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'
  4.          
.          forvalue i=1/2 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.   }     

Summary for variables: underrp_m
     by categories of: credit (=1 if credit constrained)

  credit |         N      mean        sd
---------+------------------------------
       0 |       151  .3784394  .2128059
       1 |        80  .3991653  .2336256
---------+------------------------------
   Total |       231  .3859166  .2202516
----------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     151    .3784394    .0173179    .2128059    .3442208    .4126579
       y |      80    .3991653    .0261201    .2336256    .3471745    .4511561
---------+--------------------------------------------------------------------
combined |     231    .3856172    .0144719    .2199536    .3571028    .4141316
---------+--------------------------------------------------------------------
    diff |            -.020726    .0313396               -.0826548    .0412029
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.6613
Ho: diff = 0                     Satterthwaite's degrees of freedom =  148.597

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2547         Pr(|T| > |t|) = 0.5094          Pr(T > t) = 0.7453

Summary for variables: underrp_m
     by categories of: bribe (dummy,=1 if pay bribes)

   bribe |         N      mean        sd
---------+------------------------------
       0 |       121  .3526269  .2019427
       1 |       110  .4194766   .233395
---------+------------------------------
   Total |       231  .3859166  .2202516
----------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     121    .3526269    .0183584    .2019427    .3162785    .3889753
       y |     110    .4194766    .0222533     .233395    .3753712    .4635819
---------+--------------------------------------------------------------------
combined |     231    .3844601    .0144468    .2195724    .3559951    .4129251
---------+--------------------------------------------------------------------
    diff |           -.0668496    .0288486               -.1237095   -.0099898
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -2.3173
Ho: diff = 0                     Satterthwaite's degrees of freedom =  216.687

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0107         Pr(|T| > |t|) = 0.0214          Pr(T > t) = 0.9893
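
Note: the three loops above differ only in the number of groups. A sketch of one loop that handles any number of groups follows. It is not part of the original do-file and runs on Stata's bundled auto dataset, so price, weight, and rep78 are stand-ins for underrp_m, w, and the grouping variable. As in the log, observation counts are unweighted while means and sds use the analytic weights.

```stata
* Hypothetical generalization (not in the original do-file),
* run on the example dataset shipped with Stata
sysuse auto, clear

* collect the distinct group codes; the first one is the reference group
quietly levelsof rep78, local(levels)
local first : word 1 of `levels'

* weighted summary statistics for the reference group
quietly summarize price [aw=weight] if rep78 == `first'
local n1  = r(N)
local m1  = r(mean)
local sd1 = r(sd)

* compare every other group with the reference group
foreach l of local levels {
    if `l' == `first' continue
    quietly summarize price [aw=weight] if rep78 == `l'
    local n2  = r(N)
    local m2  = r(mean)
    local sd2 = r(sd)
    display as text _n "rep78: `first' vs `l'"
    ttesti `n1' `m1' `sd1' `n2' `m2' `sd2', unequal
}
```

Looping over levelsof avoids hard-coding r(Stat1) through r(Stat4), so the same block would serve a two-, three-, or four-category variable.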

.   
.  * in the survey
.   tabstat underrp_ms [aw=w], by(city_no) stats(n mean sd) nototal save

Summary for variables: underrp_ms
     by categories of: city_no (city)

    city_no |         N      mean        sd
------------+------------------------------
Ulaanbaatar |       126  .0937303  .3803728
    Darkhan |        50  .2314381   .299097
    Edernet |        38  .2470595  .2925276
       Hovd |        17  .0950854   .385837
-------------------------------------------

.  *return list
.   matrix mcity=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'

.   tabstat underrp_ms [aw=w], by(sector0) stats(n mean sd) nototal save

Summary for variables: underrp_ms
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd
-------------+------------------------------
 manufacture |       125  .1331166  .3198381
construction |        62  .0530117  .4435399
     service |        30  .1411455  .2629061
     tourism |        14   .376261  .2481357
--------------------------------------------

.   matrix msect=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'

.   tabstat underrp_ms [aw=w], by(fsize2003) stats(n mean sd) nototal save

Summary for variables: underrp_ms
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd
-----------------+------------------------------
 small < 10 emps |        57  .0094081  .3069934
medium 10 <= x < |       152  .1260716  .3934107
    large >= 100 |        22  .3265347  .1991903
------------------------------------------------

.   matrix msize=r(Stat1)'\r(Stat2)'\r(Stat3)'

.   tabstat underrp_ms [aw=w], by(bribe) stats(n mean sd) nototal save

Summary for variables: underrp_ms
     by categories of: bribe (dummy,=1 if pay bribes)

   bribe |         N      mean        sd
---------+------------------------------
       0 |       121  .0866205  .3983875
       1 |       110  .1515468   .337737
----------------------------------------

.   matrix mbrib=r(Stat1)'\r(Stat2)'

.   tabstat underrp_ms [aw=w], by(credit) stats(n mean sd) save

Summary for variables: underrp_ms
     by categories of: credit (=1 if credit constrained)

  credit |         N      mean        sd
---------+------------------------------
       0 |       151  .1700946  .3530549
       1 |        80   .028336  .3841486
---------+------------------------------
   Total |       231  .1189524  .3700696
----------------------------------------

.   matrix mcred=r(Stat1)'\r(Stat2)'\r(StatTotal)'

.  
.   matrix munderrp_svy=mcity\msect\msize\mbrib\mcred

.   matrix list munderrp_svy

munderrp_svy[16,3]
                    N       mean         sd
underrp_ms        126  .09373032  .38037284
underrp_ms         50  .23143813    .299097
underrp_ms         38  .24705947   .2925276
underrp_ms         17  .09508545  .38583697
underrp_ms        125  .13311661  .31983811
underrp_ms         62  .05301167  .44353991
underrp_ms         30   .1411455  .26290607
underrp_ms         14  .37626103  .24813569
underrp_ms         57  .00940812   .3069934
underrp_ms        152   .1260716  .39341071
underrp_ms         22  .32653466  .19919031
underrp_ms        121  .08662053  .39838747
underrp_ms        110  .15154678  .33773703
underrp_ms        151  .17009465  .35305487
underrp_ms         80  .02833599  .38414864
underrp_ms        231  .11895245  .37006957

.   matrix drop mcity msect msize mbrib mcred 

.  
.    *** pairwise unequal-variance t tests: each group vs. the first group ***
. foreach var of varlist city_no sector0 {
  2.      tabstat underrp_ms [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'\r(Stat4)'
  4.          
.          forvalue i=1/4 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal
 11.          ttesti `n1' `m1' `sd1' `n4' `m4' `sd4',unequal 
 12.   }

Summary for variables: underrp_ms
     by categories of: city_no (city)

    city_no |         N      mean        sd
------------+------------------------------
Ulaanbaatar |       126  .0937303  .3803728
    Darkhan |        50  .2314381   .299097
    Edernet |        38  .2470595  .2925276
       Hovd |        17  .0950854   .385837
------------+------------------------------
      Total |       231  .1189524  .3700696
-------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     126    .0937303    .0338863    .3803728    .0266651    .1607955
       y |      50    .2314381    .0422987     .299097    .1464357    .3164406
---------+--------------------------------------------------------------------
combined |     176    .1328519    .0274144    .3636932    .0787464    .1869573
---------+--------------------------------------------------------------------
    diff |           -.1377078    .0541984               -.2450772   -.0303384
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -2.5408
Ho: diff = 0                     Satterthwaite's degrees of freedom =  113.717

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0062         Pr(|T| > |t|) = 0.0124          Pr(T > t) = 0.9938

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     126    .0937303    .0338863    .3803728    .0266651    .1607955
       y |      38    .2470595    .0474542    .2925276     .150908    .3432109
---------+--------------------------------------------------------------------
combined |     164    .1292578    .0286472    .3668636    .0726903    .1858253
---------+--------------------------------------------------------------------
    diff |           -.1533291    .0583111               -.2694101   -.0372482
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -2.6295
Ho: diff = 0                     Satterthwaite's degrees of freedom =   78.326

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0051         Pr(|T| > |t|) = 0.0103          Pr(T > t) = 0.9949

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     126    .0937303    .0338863    .3803728    .0266651    .1607955
       y |      17    .0950854    .0935792     .385837   -.1032936    .2934645
---------+--------------------------------------------------------------------
combined |     143    .0938914    .0317482    .3796532    .0311312    .1566516
---------+--------------------------------------------------------------------
    diff |           -.0013551    .0995256               -.2086846    .2059744
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.0136
Ho: diff = 0                     Satterthwaite's degrees of freedom =  20.4262

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4946         Pr(|T| > |t|) = 0.9893          Pr(T > t) = 0.5054

Summary for variables: underrp_ms
     by categories of: sector0 (industry)

     sector0 |         N      mean        sd
-------------+------------------------------
 manufacture |       125  .1331166  .3198381
construction |        62  .0530117  .4435399
     service |        30  .1411455  .2629061
     tourism |        14   .376261  .2481357
-------------+------------------------------
       Total |       231  .1189524  .3700696
--------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     125    .1331166    .0286072    .3198381     .076495    .1897383
       y |      62    .0530117    .0563296    .4435399   -.0596264    .1656497
---------+--------------------------------------------------------------------
combined |     187    .1065578    .0267835    .3662589    .0537193    .1593963
---------+--------------------------------------------------------------------
    diff |            .0801049    .0631775               -.0453448    .2055547
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   1.2679
Ho: diff = 0                     Satterthwaite's degrees of freedom =  93.4649

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.8960         Pr(|T| > |t|) = 0.2080          Pr(T > t) = 0.1040

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     125    .1331166    .0286072    .3198381     .076495    .1897383
       y |      30    .1411455    .0479999    .2629061    .0429748    .2393162
---------+--------------------------------------------------------------------
combined |     155    .1346706    .0248083    .3088603    .0856622     .183679
---------+--------------------------------------------------------------------
    diff |           -.0080289    .0558781               -.1201702    .1041124
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -0.1437
Ho: diff = 0                     Satterthwaite's degrees of freedom =  51.7338

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.4432         Pr(|T| > |t|) = 0.8863          Pr(T > t) = 0.5568

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     125    .1331166    .0286072    .3198381     .076495    .1897383
       y |      14     .376261    .0663171    .2481357    .2329918    .5195303
---------+--------------------------------------------------------------------
combined |     139     .157606    .0272363     .321111    .1037516    .2114604
---------+--------------------------------------------------------------------
    diff |           -.2431444    .0722241               -.3947492   -.0915396
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -3.3665
Ho: diff = 0                     Satterthwaite's degrees of freedom =  18.2221

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0017         Pr(|T| > |t|) = 0.0034          Pr(T > t) = 0.9983

.         
. foreach var of varlist fsize2003 {
  2.      tabstat underrp_ms [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'\r(Stat3)'
  4.          
.          forvalue i=1/3 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.          ttesti `n1' `m1' `sd1' `n3' `m3' `sd3',unequal  
 11.   }     

Summary for variables: underrp_ms
     by categories of: fsize2003 (firm size)

       fsize2003 |         N      mean        sd
-----------------+------------------------------
 small < 10 emps |        57  .0094081  .3069934
medium 10 <= x < |       152  .1260716  .3934107
    large >= 100 |        22  .3265347  .1991903
-----------------+------------------------------
           Total |       231  .1189524  .3700696
------------------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      57    .0094081    .0406623    .3069934   -.0720482    .0908644
       y |     152    .1260716    .0319098    .3934107    .0630242     .189119
---------+--------------------------------------------------------------------
combined |     209    .0942543    .0259227    .3747595    .0431495    .1453591
---------+--------------------------------------------------------------------
    diff |           -.1166635    .0516881               -.2189358   -.0143912
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -2.2571
Ho: diff = 0                     Satterthwaite's degrees of freedom =  128.183

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0128         Pr(|T| > |t|) = 0.0257          Pr(T > t) = 0.9872

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      57    .0094081    .0406623    .3069934   -.0720482    .0908644
       y |      22    .3265347    .0424675    .1991903    .2382186    .4148507
---------+--------------------------------------------------------------------
combined |      79    .0977218    .0353664    .3143432    .0273128    .1681309
---------+--------------------------------------------------------------------
    diff |           -.3171265    .0587955               -.4347901    -.199463
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -5.3937
Ho: diff = 0                     Satterthwaite's degrees of freedom =  58.6651

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
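
The `ttesti ..., unequal` calls in the loop above work only from the N, mean, and sd that `tabstat` saved, so each test can be cross-checked by hand. The sketch below recomputes the small-versus-large comparison from the figures printed in the table above. The scalar names are illustrative and are not part of the original do-file.

```stata
* Welch t statistic and Satterthwaite degrees of freedom, small (<10)
* versus large (>=100) firms, using the N, mean and sd reported above.
* welch_* are hypothetical scalar names chosen for this sketch only.
scalar welch_v1 = .3069934^2/57
scalar welch_v2 = .1991903^2/22
* standard error of the difference in means
scalar welch_se = sqrt(welch_v1 + welch_v2)
scalar welch_t  = (.0094081 - .3265347)/welch_se
* Satterthwaite approximation to the degrees of freedom
scalar welch_df = (welch_v1 + welch_v2)^2/(welch_v1^2/(57-1) + welch_v2^2/(22-1))
display "t = " %8.4f welch_t "   df = " %8.4f welch_df
```

Up to rounding of the inputs, this should reproduce the t statistic and degrees of freedom reported by the second `ttesti` call above.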

.   
.   foreach var of varlist credit bribe{
  2.      tabstat underrp_ms [aw=w], by(`var') stat(n mean sd) save
  3.          matrix rall=r(Stat1)'\r(Stat2)'
  4.          
.          forvalue i=1/2 {
  5.            local n`i'=rall[`i',1]
  6.            local m`i'=rall[`i',2]
  7.            local sd`i'=rall[`i',3] 
  8.          }
  9.          ttesti `n1' `m1' `sd1' `n2' `m2' `sd2',unequal
 10.   }     

Summary for variables: underrp_ms
     by categories of: credit (=1 if credit constrained)

  credit |         N      mean        sd
---------+------------------------------
       0 |       151  .1700946  .3530549
       1 |        80   .028336  .3841486
---------+------------------------------
   Total |       231  .1189524  .3700696
----------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     151    .1700946    .0287312    .3530549    .1133245    .2268648
       y |      80     .028336    .0429491    .3841486   -.0571521    .1138241
---------+--------------------------------------------------------------------
combined |     231    .1210007    .0243129    .3695243    .0730962    .1689052
---------+--------------------------------------------------------------------
    diff |            .1417587    .0516731                 .039656    .2438613
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   2.7434
Ho: diff = 0                     Satterthwaite's degrees of freedom =  149.734

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9966         Pr(|T| > |t|) = 0.0068          Pr(T > t) = 0.0034

Summary for variables: underrp_ms
     by categories of: bribe (dummy,=1 if pay bribes)

   bribe |         N      mean        sd
---------+------------------------------
       0 |       121  .0866205  .3983875
       1 |       110  .1515468   .337737
---------+------------------------------
   Total |       231  .1189524  .3700696
----------------------------------------

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |     121    .0866205     .036217    .3983875    .0149133    .1583278
       y |     110    .1515468     .032202     .337737    .0877235      .21537
---------+--------------------------------------------------------------------
combined |     231    .1175378    .0244347     .371376    .0693932    .1656823
---------+--------------------------------------------------------------------
    diff |           -.0649263    .0484628               -.1604186    .0305661
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =  -1.3397
Ho: diff = 0                     Satterthwaite's degrees of freedom =  227.915

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0908         Pr(|T| > |t|) = 0.1817          Pr(T > t) = 0.9092

.   
.   *** ================
.   ***  PRODUCE TABLE 4
.   *** ================
.   *** matrix mTable4 provides the figures in Table 4 in the paper
.   matrix mTable4=munderrp_tax,munderrp_svy

.   matrix list mTable4

mTable4[16,6]
                   N       mean         sd          N       mean         sd
underrp_m        126  .36892146  .21418009        126  .09373032  .38037284
underrp_m         50  .46601492  .24663817         50  .23143813    .299097
underrp_m         38  .46154703  .22374982         38  .24705947   .2925276
underrp_m         17  .38014608  .20461612         17  .09508545  .38583697
underrp_m        125   .4236386  .24041919        125  .13311661  .31983811
underrp_m         62  .33878261  .18368172         62  .05301167  .44353991
underrp_m         30  .36721976   .2165334         30   .1411455  .26290607
underrp_m         14  .43661133  .24221489         14  .37626103  .24813569
underrp_m         57  .42780633  .24477972         57  .00940812   .3069934
underrp_m        152  .37389833  .21811515        152   .1260716  .39341071
underrp_m         22  .37639765  .16368221         22  .32653466  .19919031
underrp_m        121  .35262692  .20194266        121  .08662053  .39838747
underrp_m        110  .41947655  .23339499        110  .15154678  .33773703
underrp_m        151  .37843935  .21280587        151  .17009465  .35305487
underrp_m         80  .39916531  .23362562         80  .02833599  .38414864
underrp_m        231  .38591665  .22025159        231  .11895245  .37006957

.   *logout, save("$path2\table4") word replace
.   matrix drop munderrp_tax munderrp_svy

.   
.   xml_tab mTable4,save("$path2\table") sheet(table4) ///
>   title(Table 4 Mean % of Total Sales Underreported by MIMIC Approach) ///
>   rnames(Ulaanbaatar Darkan Erdenet Hovd Manufacture Construction Tourism ///
>   Service Small(<10) Medium(10-99) Large(>=100) nobribe bribe notconstraned constrained Total) ///
>   notes(Note: figures are weighted by sampling weights; ///
>   denote significance of two-sample t-tests for equal means with different observations and variances at 10%\5% level by *\** respectively; ///
>   Source: calculated using MIMIC estimation results of Table 3 (column (4))) ///
>   font("Times New Roman" 12) append


note: results saved to D:\Dropbox\FIRSTPAPERDRAFTSANDPROGRAMS\CleanFiles4SubmissionUpload\TABLES\table.xml

.   
.   *** ============================================================= 
.   *** COMPARE THE UNDERREPORTING BY 3 APPROACHES ON THE SAME SAMPLE
.   *** Produce Table 5 in the paper
.   *** =============================================================
.   *y_p: predicted sales by MIMIC model conditional on indicators and X
.   *y_d: sales inferred by underrp_d and sales reported to tax office
. 
.   g y_d=sale_t/(1-underrp_d)
(103 missing values generated)

.  
.   * with sampling weight, rmiss=0 ==> common sample
.   tabstat sale_t sale_s y_d y_p [aw=w] if rmiss==0,stats(sum) save

   stats |    sale_t    sale_s       y_d       y_p
---------+----------------------------------------
     sum |  6.31e+07  7.60e+07  1.12e+08  9.98e+07
--------------------------------------------------

.   matrix magg=r(StatTotal)

.  
.   *** CALCULATE AGGREGATE UNDERREPORTING FROM 3 APPROACHES ***
.   * with suffix _m ==> MIMIC approach
.   * with suffix _n ==> indirect approach
.   * with suffix _d ==> direct approach
.  
.   scalar aggundrp_m=1-magg[1,1]/magg[1,4]

.   scalar aggundrp_n=1-magg[1,1]/magg[1,2]

.   scalar aggundrp_d=1-magg[1,1]/magg[1,3]

.   scalar list aggundrp_m aggundrp_n aggundrp_d
aggundrp_m =  .36810043
aggundrp_n =  .16980374
aggundrp_d =  .43806205
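
Each aggregate figure is one minus the ratio of total sales reported to the tax office to the total under the relevant benchmark (predicted sales for MIMIC, survey sales for the indirect approach, inferred sales for the direct approach). As a rough sanity check, the same ratios can be formed from the rounded sums that `tabstat` printed above. This is only a sketch: the printed sums carry three significant digits, so agreement with the stored scalars should be expected to about two decimal places.

```stata
* Approximate aggregate underreporting from the rounded tabstat sums.
* Parentheses keep each expression as a single display argument.
display "MIMIC:    " %6.4f (1 - 6.31e+07/9.98e+07)
display "indirect: " %6.4f (1 - 6.31e+07/7.60e+07)
display "direct:   " %6.4f (1 - 6.31e+07/1.12e+08)
```

The results should come out near .37, .17 and .44, in line with `aggundrp_m`, `aggundrp_n` and `aggundrp_d`.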

.  
.   * matrix aggundrp ==> the last column of TABLE 5 in the paper
.   matrix aggundrp=aggundrp_m\aggundrp_n\aggundrp_d

.   
.   
.   *** Below Produce Columns 2-6 of Table 5 in the paper
.   *** ======================================================
.   *** Check distributions of underreporting across quantiles 
.   *** defined by survey sales for the 3 approaches
.   *** ======================================================
.   
.   * first define quantiles
.   tabstat lsale_s if rmiss==0,stats(p25 p50 p75) save

    variable |       p25       p50       p75
-------------+------------------------------
     lsale_s |  10.03452  10.93857  12.11382
--------------------------------------------

.   matrix mqn=r(StatTotal)

.   
.   g quartile=1 if lsale_s<=mqn[1,1]&rmiss==0
(317 missing values generated)

.   replace quartile=2 if lsale_s>mqn[1,1]&lsale_s<=mqn[2,1]&rmiss==0
(47 real changes made)

.   replace quartile=3 if lsale_s>mqn[2,1]&lsale_s<=mqn[3,1]&rmiss==0
(47 real changes made)

.   replace quartile=4 if lsale_s>mqn[3,1]&lsale_s~=.&rmiss==0
(46 real changes made)
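
The quartile variable above is built by hand from the saved percentiles. A shorter route that may give the same grouping is `xtile`. The sketch below assumes the variables `lsale_s`, `rmiss` and `quartile` exist as in this log; `quartile_alt` is a hypothetical name introduced only for the comparison. Tie handling at the cut points is worth verifying rather than assuming.

```stata
* Alternative quartile construction on the common sample.
* quartile_alt is a hypothetical variable name, not in the original do-file.
xtile quartile_alt = lsale_s if rmiss==0, nquantiles(4)
* Cross-tabulate against the hand-built variable to see whether they agree.
tabulate quartile quartile_alt, missing
```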

.   
.   * second summarize firm-level mean of underreporting by quartiles defined above
.   tabstat underrp_d [aw = w] if rmiss==0, stat(mean) by(quartile) save

Summary for variables: underrp_d
     by categories of: quartile 

quartile |      mean
---------+----------
       1 |   .314864
       2 |  .3546698
       3 |  .3303067
       4 |  .3849107
---------+----------
   Total |  .3504845
--------------------

.   matrix undrpbyqrt_d=r(Stat1),r(Stat2),r(Stat3),r(Stat4),r(StatTotal)

.   
.   tabstat underrp_n [aw = w] if rmiss==0, stat(mean) by(quartile) save

Summary for variables: underrp_n
     by categories of: quartile 

quartile |      mean
---------+----------
       1 |  .1611441
       2 |  .1441582
       3 |  .0429392
       4 |  .2132794
---------+----------
   Total |  .1446992
--------------------

.   matrix undrpbyqrt_n=r(Stat1),r(Stat2),r(Stat3),r(Stat4),r(StatTotal)

.   
.   tabstat underrp_m [aw = w] if rmiss==0, stat(mean) by(quartile) save

Summary for variables: underrp_m
     by categories of: quartile 

quartile |      mean
---------+----------
       1 |  .3308527
       2 |  .3946168
       3 |  .3395808
       4 |   .425583
---------+----------
   Total |  .3787979
--------------------

.   matrix undrpbyqrt_m=r(Stat1),r(Stat2),r(Stat3),r(Stat4),r(StatTotal)

.  
.   tabstat underrp_m [aw = w] if rmiss==0, stat(n) by(quartile) save

Summary for variables: underrp_m
     by categories of: quartile 

quartile |         N
---------+----------
       1 |        47
       2 |        47
       3 |        47
       4 |        46
---------+----------
   Total |       187
--------------------

.   matrix mcount=r(Stat1),r(Stat2),r(Stat3),r(Stat4),r(StatTotal)

.   
.   
.   
.   * third, count the number of firms that underreport in each quartile for the 3 approaches
.   g index1=(underrp_d>0&underrp_d~=.)

.   g index2=(lsale_s>lsale_t&lsale_s~=.)

.   g index3=(y_p>lsale_t&y_p~=.)

.   
.   * direct approach
.   tabstat index1 if rmiss==0, stat(sum) by(quartile) save

Summary for variables: index1
     by categories of: quartile 

quartile |       sum
---------+----------
       1 |        33
       2 |        39
       3 |        34
       4 |        37
---------+----------
   Total |       143
--------------------

.   matrix cn1 = r(Stat1)\r(Stat2)\r(Stat3)\r(Stat4)\r(StatTotal)

.   
.   * indirect approach
.   tabstat index2 if rmiss==0, stat(sum) by(quartile) save

Summary for variables: index2
     by categories of: quartile 

quartile |       sum
---------+----------
       1 |        28
       2 |        28
       3 |        25
       4 |        26
---------+----------
   Total |       107
--------------------

.   matrix cn2 = r(Stat1)\r(Stat2)\r(Stat3)\r(Stat4)\r(StatTotal)

.   
.   * MIMIC approach
.   tabstat index3 if rmiss==0, stat(sum) by(quartile) save

Summary for variables: index3
     by categories of: quartile 

quartile |       sum
---------+----------
       1 |        47
       2 |        47
       3 |        47
       4 |        46
---------+----------
   Total |       187
--------------------

.   matrix cn3 = r(Stat1)\r(Stat2)\r(Stat3)\r(Stat4)\r(StatTotal)

.   
.   matrix cnall=cn1'\cn2'\cn3'\mcount

.   *matrix list cnall
.   
.   * create share of underreporting firms in each quartile
.   matrix mpct=J(4,5,0)

.   forvalues i = 1/4 {
  2.            forvalues j = 1/5 {
  3.                   matrix mpct[`i',`j']= 100*cnall[`i',`j']/cnall[4,`j']
  4.            }
  5.         }

.   matrix list mpct

mpct[4,5]
           c1         c2         c3         c4         c5
r1  70.212766  82.978723  72.340426  80.434783  76.470588
r2  59.574468  59.574468  53.191489  56.521739  57.219251
r3        100        100        100        100        100
r4        100        100        100        100        100
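
Rows r1 to r3 of `mpct` are the counts of underreporting firms divided by the quartile sizes, times 100 (for example, 33 of 47 firms in the first quartile under the direct approach). Because the mean of a 0/1 indicator is the share of ones, the same shares can be obtained without the matrix loop. The sketch below assumes `index1`, `index2`, `index3`, `quartile` and `rmiss` exist as defined in this log, so it would have to run before `index*` is dropped.

```stata
* Worked check for one cell: direct approach, first quartile.
display %9.4f 100*33/47
* Shares (as proportions, not percentages) for all three approaches.
tabstat index1 index2 index3 if rmiss==0, stat(mean) by(quartile) format(%9.4f)
```

Multiplying the reported means by 100 should give the entries of `mpct`.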

.   matrix drop cn1 cn2 cn3 cnall

.   drop index*

.   
.   *** ===============
.   *** PRODUCE TABLE 5 
.   *** ===============
.   *** NOW produce columns 2-6 of TABLE 5 in the paper (mTable5)
.   *** matrix aggundrp is column 7 of TABLE 5!
.   
.   matrix mTable5=undrpbyqrt_d\mpct[1,1..5]\undrpbyqrt_n\mpct[2,1..5] ///
>                  \undrpbyqrt_m\mpct[3,1..5]\mcount

.   matrix list mTable5

mTable5[7,5]
      underrp_d  underrp_d  underrp_d  underrp_d  underrp_d
mean  .31486402  .35466984  .33030672  .38491067  .35048452
  r1  70.212766  82.978723  72.340426  80.434783  76.470588
mean  .16114407   .1441582  .04293925  .21327937  .14469915
  r2  59.574468  59.574468  53.191489  56.521739  57.219251
mean  .33085267  .39461682  .33958083  .42558303  .37879794
  r3        100        100        100        100        100
   N         47         47         47         46        187

.   *logout, save("$path2\table5") word replace
.   
.   xml_tab mTable5,save("$path2\table") sheet(table5) ///
>   title(Table 5 Comparison of Three Approaches  to Measuring Underreporting ///
>   (% of firms underreporting in each quantile in brackets)) ///
>   rnames(Direct Indirect MIMIC N) ///
>   notes(Note: quantiles are defined for sales reported in the survey; ///
>   all figures are weighted by sampling weights) ///
>   font("Times New Roman" 12) append


note: results saved to D:\Dropbox\FIRSTPAPERDRAFTSANDPROGRAMS\CleanFiles4SubmissionUpload\TABLES\table.xml

.   
.   matrix list aggundrp

aggundrp[3,1]
           c1
r1  .36810043
r2  .16980374
r3  .43806205

.   * ==================================================
.   * check for any sample selection bias for TABLE 5!!!
.   * ==================================================
.   * for footnote 39 in the paper
.   * are the different results from the 3 approaches driven by a sample selection effect?
.   * indexd = 1 if all 3 underreporting measures are nonmissing (the 187-firm common sample)
.   * indexd = 0 otherwise 
.   * a nonsignificant coefficient on indexd indicates no sample selection issue 
.                 
.   g indexd = 1 if rmiss == 0 
(177 missing values generated)

.   replace indexd = 0 if rmiss != 0 
(177 real changes made)

.   reg underrp_d indexd 

      Source |       SS       df       MS              Number of obs =     297
-------------+------------------------------           F(  1,   295) =    0.33
       Model |  .026444279     1  .026444279           Prob > F      =  0.5690
    Residual |  23.9906287   295  .081324165           R-squared     =  0.0011
-------------+------------------------------           Adj R-squared = -0.0023
       Total |   24.017073   296   .08113876           Root MSE      =  .28517

------------------------------------------------------------------------------
   underrp_d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      indexd |  -.0195401   .0342666    -0.57   0.569    -.0869781    .0478978
       _cons |       .382   .0271903    14.05   0.000     .3284885    .4355115
------------------------------------------------------------------------------

.   reg underrp_n indexd 

      Source |       SS       df       MS              Number of obs =     259
-------------+------------------------------           F(  1,   257) =    0.10
       Model |  .023564858     1  .023564858           Prob > F      =  0.7486
    Residual |  58.8328582   257  .228921627           R-squared     =  0.0004
-------------+------------------------------           Adj R-squared = -0.0035
       Total |  58.8564231   258  .228125671           Root MSE      =  .47846

------------------------------------------------------------------------------
   underrp_n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      indexd |    .021291     .06636     0.32   0.749    -.1093876    .1519695
       _cons |   .1547029   .0563868     2.74   0.007     .0436639    .2657418
------------------------------------------------------------------------------

.   reg underrp_m indexd

      Source |       SS       df       MS              Number of obs =     231
-------------+------------------------------           F(  1,   229) =    0.38
       Model |   .01982602     1   .01982602           Prob > F      =  0.5386
    Residual |  11.9700825   229  .052271103           R-squared     =  0.0017
-------------+------------------------------           Adj R-squared = -0.0027
       Total |  11.9899086   230  .052130037           Root MSE      =  .22863

------------------------------------------------------------------------------
   underrp_m |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      indexd |  -.0235926    .038308    -0.62   0.539    -.0990739    .0518886
       _cons |   .4377505   .0344671    12.70   0.000     .3698374    .5056637
------------------------------------------------------------------------------
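
Regressing each measure on the single dummy `indexd` amounts to comparing the mean of that measure inside and outside the common sample. The same comparison can be run as a two-sample t test. The sketch below assumes `indexd` and the three underreporting variables exist as in this log. With equal variances assumed, which is the `ttest` default, the t statistic should match the t on `indexd` in the corresponding regression in magnitude. The sign is reversed because `ttest` reports mean(0) minus mean(1).

```stata
* Two-sample t tests equivalent to the three regressions on indexd.
ttest underrp_d, by(indexd)
ttest underrp_n, by(indexd)
ttest underrp_m, by(indexd)
```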

.         
.   *** ===========================
.   ***  DO DESCRIPTIVE REGRESSIONS 
.   *** ===========================
.   * for footnote 42 in the paper, check which measure has the most explanatory power!
.   * conclusion: the MIMIC measure does, judging by R-squared.
.  
.   * direct approach
.   reg underrp_d industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 if rmiss==0,robust

Linear regression                                      Number of obs =     187
                                                       F(  6,   180) =    0.70
                                                       Prob > F      =  0.6499
                                                       R-squared     =  0.0224
                                                       Root MSE      =  .28219

------------------------------------------------------------------------------
             |               Robust
   underrp_d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0186824   .0481034    -0.39   0.698    -.1136015    .0762368
  industryd3 |   .0017019   .0624223     0.03   0.978    -.1214717    .1248755
  industryd4 |  -.0116756   .0915348    -0.13   0.899    -.1922949    .1689437
      cityd2 |  -.0024578   .0496381    -0.05   0.961    -.1004051    .0954896
      cityd3 |   .0849003   .0566139     1.50   0.135     -.026812    .1966127
      cityd5 |      .1147   .0869228     1.32   0.189    -.0568187    .2862188
       _cons |   .3446172    .036345     9.48   0.000        .2729    .4163343
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  187
               e(df_m) =  6
               e(df_r) =  180
                  e(F) =  .7000069688531641
                 e(r2) =  .0224315699559858
               e(rmse) =  .282193368438543
                e(mss) =  .3289111640661204
                e(rss) =  14.33395749432443
               e(r2_a) =  -.0101540443788146
                 e(ll) =  -25.18888333341598
               e(ll_0) =  -27.31011637600779
               e(rank) =  7

macros:
            e(cmdline) : "regress underrp_d industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_d"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 7
                  e(V) :  7 x 7
       e(V_modelbased) :  7 x 7

functions:
             e(sample)   

.   scalar r2a=e(r2)

.   
.   reg underrp_d industryd2 industryd3 industryd4 cityd2 cityd3 cityd5  ///
>       sized2 sized3 lmwage lksflow manexp bribe credit if rmiss==0, robust

Linear regression                                      Number of obs =     187
                                                       F( 13,   173) =    2.09
                                                       Prob > F      =  0.0170
                                                       R-squared     =  0.1144
                                                       Root MSE      =  .27397

------------------------------------------------------------------------------
             |               Robust
   underrp_d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0835918   .0532062    -1.57   0.118    -.1886087    .0214251
  industryd3 |  -.0390288   .0680174    -0.57   0.567    -.1732796    .0952219
  industryd4 |  -.0168289   .0782044    -0.22   0.830    -.1711865    .1375287
      cityd2 |   .0012765   .0521312     0.02   0.980    -.1016187    .1041716
      cityd3 |   .0953925    .056439     1.69   0.093    -.0160052    .2067902
      cityd5 |   .1626265   .0893791     1.82   0.071    -.0137874    .3390404
      sized2 |  -.0047341   .0526649    -0.09   0.928    -.1086825    .0992144
      sized3 |   .0235039   .0971353     0.24   0.809     -.168219    .2152267
      lmwage |   .0820112   .0257385     3.19   0.002     .0312093    .1328131
     lksflow |  -.0168779   .0112704    -1.50   0.136    -.0391231    .0053673
      manexp |   .0041654   .0167248     0.25   0.804    -.0288456    .0371764
       bribe |   .1326592   .0433618     3.06   0.003     .0470729    .2182455
      credit |  -.0002354   .0450626    -0.01   0.996    -.0891788     .088708
       _cons |  -.0971465   .1799001    -0.54   0.590    -.4522281    .2579351
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  187
               e(df_m) =  13
               e(df_r) =  173
                  e(F) =  2.087284203454001
                 e(r2) =  .1144220661494758
               e(rmse) =  .2739679839826349
                e(mss) =  1.677755727571439
                e(rss) =  12.98511293081911
               e(r2_a) =  .0478757474208238
                 e(ll) =  -15.94848121953316
               e(ll_0) =  -27.31011637600779
               e(rank) =  14

macros:
            e(cmdline) : "regress underrp_d industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_d"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 14
                  e(V) :  14 x 14
       e(V_modelbased) :  14 x 14

functions:
             e(sample)   

.   scalar r2b=e(r2)

.   
.   * for underrp_d>0
.   reg underrp_d industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 if rmiss==0&underrp_d~=0,robust

Linear regression                                      Number of obs =     143
                                                       F(  6,   136) =    0.61
                                                       Prob > F      =  0.7244
                                                       R-squared     =  0.0270
                                                       Root MSE      =  .22544

------------------------------------------------------------------------------
             |               Robust
   underrp_d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0366283   .0457929    -0.80   0.425    -.1271865      .05393
  industryd3 |   .0106778   .0558248     0.19   0.849    -.0997191    .1210747
  industryd4 |   .0564083   .0809973     0.70   0.487    -.1037689    .2165854
      cityd2 |  -.0246419   .0406154    -0.61   0.545    -.1049614    .0556776
      cityd3 |  -.0065416   .0549965    -0.12   0.905    -.1153005    .1022173
      cityd5 |   .0913562    .080097     1.14   0.256    -.0670406     .249753
       _cons |   .4772948   .0324083    14.73   0.000     .4132054    .5413843
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  143
               e(df_m) =  6
               e(df_r) =  136
                  e(F) =  .6070150796554247
                 e(r2) =  .0270186130363125
               e(rmse) =  .2254358174426238
                e(mss) =  .1919301770574142
                e(rss) =  6.911697858899268
               e(r2_a) =  -.0159070363885563
                 e(ll) =  13.71028557811323
               e(ll_0) =  11.75187723244609
               e(rank) =  7

macros:
            e(cmdline) : "regress underrp_d industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_d"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 7
                  e(V) :  7 x 7
       e(V_modelbased) :  7 x 7

functions:
             e(sample)   

.   scalar r2c=e(r2)

.   reg underrp_d industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 ///
>       sized2 sized3 lmwage lksflow manexp bribe credit if rmiss==0&underrp_d~=0, robust 

Linear regression                                      Number of obs =     143
                                                       F( 13,   129) =    0.75
                                                       Prob > F      =  0.7120
                                                       R-squared     =  0.0675
                                                       Root MSE      =   .2266

------------------------------------------------------------------------------
             |               Robust
   underrp_d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0684997   .0549489    -1.25   0.215    -.1772175    .0402182
  industryd3 |  -.0215604   .0629648    -0.34   0.733    -.1461377     .103017
  industryd4 |   .0662664   .0803369     0.82   0.411    -.0926821    .2252148
      cityd2 |  -.0125618   .0456195    -0.28   0.783     -.102821    .0776975
      cityd3 |   .0009616    .061245     0.02   0.987    -.1202132    .1221364
      cityd5 |   .0788719   .0837876     0.94   0.348    -.0869039    .2446476
      sized2 |  -.0198883   .0493476    -0.40   0.688    -.1175238    .0777471
      sized3 |   .0459311   .0846642     0.54   0.588    -.1215791    .2134414
      lmwage |   .0412163    .032394     1.27   0.206     -.022876    .1053086
     lksflow |  -.0176935   .0106645    -1.66   0.100    -.0387935    .0034065
      manexp |   .0081702   .0153229     0.53   0.595    -.0221466    .0384871
       bribe |  -.0094274   .0399039    -0.24   0.814    -.0883783    .0695235
      credit |   .0053319   .0422047     0.13   0.900    -.0781711     .088835
       _cons |   .3722135   .2401479     1.55   0.124     -.102925     .847352
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  143
               e(df_m) =  13
               e(df_r) =  129
                  e(F) =  .7487092053838407
                 e(r2) =  .0675327828988027
               e(rmse) =  .2266011122602917
                e(mss) =  .4797277699461109
                e(rss) =  6.623900266010571
               e(r2_a) =  -.0264367816152715
                 e(ll) =  16.75124903886074
               e(ll_0) =  11.75187723244609
               e(rank) =  14

macros:
            e(cmdline) : "regress underrp_d industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_d"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 14
                  e(V) :  14 x 14
       e(V_modelbased) :  14 x 14

functions:
             e(sample)   

.   scalar r2d=e(r2)

.   
.   scalar r2ba=r2b-r2a

.   scalar r2dc=r2d-r2c

.   
.   * MIMIC approach
.   * Descriptive regression for footnote 37 in the paper!
.   reg underrp_m industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 if rmiss==0, robust

Linear regression                                      Number of obs =     187
                                                       F(  6,   180) =    2.34
                                                       Prob > F      =  0.0338
                                                       R-squared     =  0.0743
                                                       Root MSE      =  .20701

------------------------------------------------------------------------------
             |               Robust
   underrp_m |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0457968   .0345995    -1.32   0.187    -.1140695     .022476
  industryd3 |  -.0107309   .0505795    -0.21   0.832    -.1105359    .0890741
  industryd4 |   .0560255   .0481654     1.16   0.246     -.039016    .1510669
      cityd2 |   .1367291   .0475409     2.88   0.005     .0429198    .2305383
      cityd3 |   .0824305   .0380291     2.17   0.032     .0073903    .1574707
      cityd5 |   .0484807   .0575041     0.84   0.400    -.0649882    .1619496
       _cons |   .3819292   .0270189    14.14   0.000     .3286148    .4352437
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  187
               e(df_m) =  6
               e(df_r) =  180
                  e(F) =  2.336342094356349
                 e(r2) =  .0743155491694893
               e(rmse) =  .2070133278072928
                e(mss) =  .61927827040165
                e(rss) =  7.713813220172944
               e(r2_a) =  .0434594008084723
                 e(ll) =  32.74546752857515
               e(ll_0) =  25.52522284704946
               e(rank) =  7

macros:
            e(cmdline) : "regress underrp_m industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_m"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 7
                  e(V) :  7 x 7
       e(V_modelbased) :  7 x 7

functions:
             e(sample)   

.   scalar r2e=e(r2)

.   reg underrp_m industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 ///
>       sized2 sized3 lmwage lksflow manexp bribe credit if rmiss==0, robust

Linear regression                                      Number of obs =     187
                                                       F( 13,   173) =    3.63
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1681
                                                       Root MSE      =  .20018

------------------------------------------------------------------------------
             |               Robust
   underrp_m |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0815484   .0382039    -2.13   0.034    -.1569542   -.0061427
  industryd3 |   .0003674   .0473529     0.01   0.994    -.0930963    .0938312
  industryd4 |   .0414385   .0494911     0.84   0.404    -.0562457    .1391227
      cityd2 |   .1326249    .044562     2.98   0.003     .0446698      .22058
      cityd3 |   .0460571   .0416852     1.10   0.271    -.0362199     .128334
      cityd5 |   .0529623   .0560497     0.94   0.346    -.0576669    .1635915
      sized2 |   .0269251   .0387687     0.69   0.488    -.0495954    .1034456
      sized3 |   .0458416   .0644119     0.71   0.478    -.0812928     .172976
      lmwage |   -.016028   .0156263    -1.03   0.306    -.0468707    .0148147
     lksflow |  -.0149354   .0086555    -1.73   0.086    -.0320193    .0021485
      manexp |  -.0037439    .012445    -0.30   0.764    -.0283075    .0208197
       bribe |   .0934763   .0321047     2.91   0.004     .0301089    .1568437
      credit |   .0820149   .0328621     2.50   0.014     .0171525    .1468772
       _cons |   .5401563   .1102477     4.90   0.000     .3225525    .7577601
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  187
               e(df_m) =  13
               e(df_r) =  173
                  e(F) =  3.625334117951798
                 e(r2) =  .1680533841185503
               e(rmse) =  .2001832580269909
                e(mss) =  1.400404225160555
                e(rss) =  6.932687265414039
               e(r2_a) =  .1055371644280367
                 e(ll) =  42.72800770454923
               e(ll_0) =  25.52522284704946
               e(rank) =  14

macros:
            e(cmdline) : "regress underrp_m industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_m"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 14
                  e(V) :  14 x 14
       e(V_modelbased) :  14 x 14

functions:
             e(sample)   

.   scalar r2f=e(r2)

.   
.   reg underrp_m industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 if rmiss==0&underrp_d~=0, robust

Linear regression                                      Number of obs =     143
                                                       F(  6,   136) =    2.03
                                                       Prob > F      =  0.0654
                                                       R-squared     =  0.0813
                                                       Root MSE      =  .21128

------------------------------------------------------------------------------
             |               Robust
   underrp_m |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0285917   .0400818    -0.71   0.477     -.107856    .0506725
  industryd3 |  -.0189575   .0578292    -0.33   0.744    -.1333182    .0954033
  industryd4 |   .0800625   .0625571     1.28   0.203     -.043648    .2037729
      cityd2 |    .146986    .054858     2.68   0.008     .0385011     .255471
      cityd3 |   .0866093   .0409544     2.11   0.036     .0056194    .1675992
      cityd5 |     .04888   .0690499     0.71   0.480    -.0876704    .1854304
       _cons |   .3929219   .0321348    12.23   0.000     .3293734    .4564703
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  143
               e(df_m) =  6
               e(df_r) =  136
                  e(F) =  2.032882626367673
                 e(r2) =  .0813006663033522
               e(rmse) =  .2112831552982333
                e(mss) =  .5372660025058078
                e(rss) =  6.071117752937722
               e(r2_a) =  .0407698133461473
                 e(ll) =  22.98187548226069
               e(ll_0) =  16.91893452872064
               e(rank) =  7

macros:
            e(cmdline) : "regress underrp_m industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_m"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 7
                  e(V) :  7 x 7
       e(V_modelbased) :  7 x 7

functions:
             e(sample)   

.   scalar r2g=e(r2)

.   reg underrp_m industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 ///
>       sized2 sized3 lmwage lksflow manexp bribe credit if rmiss==0&underrp_d~=0, robust 

Linear regression                                      Number of obs =     143
                                                       F( 13,   129) =    2.44
                                                       Prob > F      =  0.0055
                                                       R-squared     =  0.1520
                                                       Root MSE      =  .20842

------------------------------------------------------------------------------
             |               Robust
   underrp_m |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0599091   .0455851    -1.31   0.191    -.1501003    .0302821
  industryd3 |   .0075839   .0558362     0.14   0.892    -.1028894    .1180572
  industryd4 |   .0590702   .0592741     1.00   0.321     -.058205    .1763454
      cityd2 |   .1356907   .0533949     2.54   0.012     .0300476    .2413339
      cityd3 |   .0630019   .0459724     1.37   0.173    -.0279555    .1539593
      cityd5 |   .0533251   .0741682     0.72   0.473    -.0934186    .2000688
      sized2 |   .0557254    .048377     1.15   0.251    -.0399898    .1514405
      sized3 |   .0601393   .0847924     0.71   0.479    -.1076245    .2279032
      lmwage |  -.0188909   .0225649    -0.84   0.404    -.0635361    .0257544
     lksflow |  -.0112221    .010564    -1.06   0.290    -.0321232    .0096791
      manexp |  -.0095198   .0149283    -0.64   0.525    -.0390558    .0200163
       bribe |    .074318    .040453     1.84   0.068    -.0057191    .1543552
      credit |   .0783775   .0396865     1.97   0.050    -.0001431    .1568982
       _cons |   .5274787   .1734678     3.04   0.003     .1842683     .870689
------------------------------------------------------------------------------

.   ereturn list

scalars:
                  e(N) =  143
               e(df_m) =  13
               e(df_r) =  129
                  e(F) =  2.441884806962813
                 e(r2) =  .1520346936459137
               e(rmse) =  .2084211626775477
                e(mss) =  1.00470359975349
                e(rss) =  5.60368015569004
               e(r2_a) =  .0665808255637189
                 e(ll) =  28.71039680742103
               e(ll_0) =  16.91893452872064
               e(rank) =  14

macros:
            e(cmdline) : "regress underrp_m industryd2 industryd3 industryd4 cityd2 cit.."
              e(title) : "Linear regression"
          e(marginsok) : "XB default"
                e(vce) : "robust"
             e(depvar) : "underrp_m"
                e(cmd) : "regress"
         e(properties) : "b V"
            e(predict) : "regres_p"
              e(model) : "ols"
          e(estat_cmd) : "regress_estat"
            e(vcetype) : "Robust"

matrices:
                  e(b) :  1 x 14
                  e(V) :  14 x 14
       e(V_modelbased) :  14 x 14

functions:
             e(sample)   

.   scalar r2h=e(r2)  

.   
.   scalar r2fe=r2f-r2e

.   scalar r2hg=r2h-r2g

.   
.   
.   scalar list r2a r2b r2ba
       r2a =  .02243157
       r2b =  .11442207
      r2ba =   .0919905

.   scalar list r2c r2d r2dc
       r2c =  .02701861
       r2d =  .06753278
      r2dc =  .04051417

.   scalar list r2e r2f r2fe
       r2e =  .07431555
       r2f =  .16805338
      r2fe =  .09373783

.   scalar list r2g r2h r2hg
       r2g =  .08130067
       r2h =  .15203469
      r2hg =  .07073403

.   scalar list r2ba r2fe r2dc r2hg
      r2ba =   .0919905
      r2fe =  .09373783
      r2dc =  .04051417
      r2hg =  .07073403
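
The scalars listed above follow one pattern: fit a baseline regression with industry and city dummies only, store e(r2), refit on the same sample with the firm-level covariates added, and take the difference (for example r2ba = r2b - r2a). A minimal sketch of that pattern is given here, assuming Stata's bundled auto dataset as a stand-in for the paper's data. The scalar names r2_base, r2_full and r2_gain and the choice of regressors are illustrative assumptions, not part of the original do-file.

```stata
* Hypothetical illustration on the bundled auto data, not the paper's data.
sysuse auto, clear

* Baseline specification: a small set of controls.
quietly regress price mpg foreign, robust
scalar r2_base = e(r2)

* Extended specification: baseline plus extra regressors, same estimation sample.
quietly regress price mpg foreign weight length, robust
scalar r2_full = e(r2)

* Incremental R-squared attributable to the added regressors.
scalar r2_gain = r2_full - r2_base
scalar list r2_base r2_full r2_gain
```

The comparison is only meaningful when both regressions use the same observations, which is why the log applies the same `if` condition to each pair.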

.   
.   * Indirect approach
.   reg underrp_n industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 if rmiss==0, robust

Linear regression                                      Number of obs =     187
                                                       F(  6,   180) =    1.03
                                                       Prob > F      =  0.4081
                                                       R-squared     =  0.0124
                                                       Root MSE      =  .45756

------------------------------------------------------------------------------
             |               Robust
   underrp_n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |   .0426052   .0796624     0.53   0.593     -.114587    .1997975
  industryd3 |   .0209006   .1004882     0.21   0.835    -.1773857     .219187
  industryd4 |  -.1105321   .0810724    -1.36   0.174    -.2705066    .0494424
      cityd2 |   .0518967   .1114808     0.47   0.642    -.1680806    .2718741
      cityd3 |   .0018564   .0883868     0.02   0.983    -.1725511     .176264
      cityd5 |   .0938572   .1015678     0.92   0.357    -.1065595     .294274
       _cons |   .1506754   .0603367     2.50   0.013     .0316172    .2697336
------------------------------------------------------------------------------

.   reg underrp_n industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 ///
>       sized2 sized3 lmwage lksflow manexp bribe credit if rmiss==0, robust  

Linear regression                                      Number of obs =     187
                                                       F( 13,   173) =    5.23
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2408
                                                       Root MSE      =  .40923

------------------------------------------------------------------------------
             |               Robust
   underrp_n |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0319179   .0810295    -0.39   0.694    -.1918516    .1280158
  industryd3 |   .0012645   .0941801     0.01   0.989    -.1846256    .1871546
  industryd4 |   -.166703   .1006638    -1.66   0.100    -.3653904    .0319843
      cityd2 |   .0573007   .0966642     0.59   0.554    -.1334924    .2480938
      cityd3 |  -.0510598   .0851894    -0.60   0.550    -.2192042    .1170847
      cityd5 |   .0023264   .0982987     0.02   0.981    -.1916927    .1963454
      sized2 |   .0379528   .0781343     0.49   0.628    -.1162664     .192172
      sized3 |   .0446834   .1417292     0.32   0.753    -.2350576    .3244243
      lmwage |    .075679    .032251     2.35   0.020     .0120229    .1393352
     lksflow |  -.0945501   .0210355    -4.49   0.000    -.1360693   -.0530309
      manexp |  -.0440088   .0260584    -1.69   0.093    -.0954422    .0074246
       bribe |   .0669757   .0663834     1.01   0.314    -.0640499    .1980014
      credit |   .2040968   .0630824     3.24   0.001     .0795865    .3286071
       _cons |   .4605461   .2562036     1.80   0.074    -.0451413    .9662334
------------------------------------------------------------------------------

.   
.   * Test whether underreporting differs between the direct approach and the MIMIC approach
.   preserve

.   keep if rmiss==0
(177 observations deleted)

.   g diff=underrp_d-underrp_m

.   reg diff industryd2 industryd3 industryd4 cityd2 cityd3 cityd5 ///
>       sized2 sized3 lmwage lksflow manexp bribe credit, robust 

Linear regression                                      Number of obs =     187
                                                       F( 13,   173) =    1.88
                                                       Prob > F      =  0.0349
                                                       R-squared     =  0.0999
                                                       Root MSE      =  .31414

------------------------------------------------------------------------------
             |               Robust
        diff |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  industryd2 |  -.0020434   .0605675    -0.03   0.973    -.1215897     .117503
  industryd3 |  -.0393963   .0757277    -0.52   0.604    -.1888655     .110073
  industryd4 |  -.0582674   .0853764    -0.68   0.496     -.226781    .1102462
      cityd2 |  -.1313484    .069099    -1.90   0.059     -.267734    .0050372
      cityd3 |   .0493354   .0596612     0.83   0.409    -.0684221    .1670929
      cityd5 |   .1096642   .0820709     1.34   0.183     -.052325    .2716534
      sized2 |  -.0316592   .0640654    -0.49   0.622    -.1581097    .0947913
      sized3 |  -.0223378   .1181343    -0.19   0.850    -.2555079    .2108324
      lmwage |   .0980392   .0281592     3.48   0.001     .0424594     .153619
     lksflow |  -.0019425   .0111817    -0.17   0.862    -.0240126    .0201276
      manexp |   .0079093    .019107     0.41   0.679    -.0298035    .0456221
       bribe |   .0391829   .0479915     0.82   0.415    -.0555414    .1339072
      credit |  -.0822503   .0541945    -1.52   0.131    -.1892179    .0247174
       _cons |  -.6373028   .2015665    -3.16   0.002    -1.035149   -.2394565
------------------------------------------------------------------------------

.   test industryd2 industryd3 industryd4 cityd2 cityd3 cityd5

 ( 1)  industryd2 = 0
 ( 2)  industryd3 = 0
 ( 3)  industryd4 = 0
 ( 4)  cityd2 = 0
 ( 5)  cityd3 = 0
 ( 6)  cityd5 = 0

       F(  6,   173) =    1.30
            Prob > F =    0.2591

.   test sized2 sized3 lmwage lksflow manexp bribe credit

 ( 1)  sized2 = 0
 ( 2)  sized3 = 0
 ( 3)  lmwage = 0
 ( 4)  lksflow = 0
 ( 5)  manexp = 0
 ( 6)  bribe = 0
 ( 7)  credit = 0

       F(  7,   173) =    2.46
            Prob > F =    0.0198

.   restore
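
The block above restricts the sample inside preserve/restore, builds the gap between two measures, regresses that gap on all covariates, and then runs joint F-tests on groups of coefficients. The same sequence can be sketched as follows, again assuming the bundled auto dataset. The two standardised variables are hypothetical stand-ins for underrp_d and underrp_m, and the regressor groups are chosen only for illustration.

```stata
* Hypothetical illustration on the bundled auto data; variable names are stand-ins.
sysuse auto, clear
preserve

* Keep only observations with every variable nonmissing (rep78 has gaps).
keep if !missing(price, weight, mpg, length, foreign, rep78)

* Put two measures on a common scale and take their difference.
egen z_price  = std(price)
egen z_weight = std(weight)
generate diff = z_price - z_weight

* Regress the difference on all covariates with robust standard errors.
regress diff mpg length foreign rep78, robust

* Joint F-tests that each group of coefficients is zero.
test mpg length
test foreign rep78

* Return to the full dataset; diff and the z_* variables are discarded.
restore
```

Because the temporary variables are created between preserve and restore, nothing needs to be dropped afterwards.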

.   
.   drop rmiss indexd

.   log close
      name:  <unnamed>
       log:  D:\Dropbox\FIRSTPAPERDRAFTSANDPROGRAMS\CleanFiles4SubmissionUpload\LOGS\logcom
> pare3approaches.log
  log type:  text
 closed on:  28 Mar 2013, 16:38:36
-------------------------------------------------------------------------------------------
